AITopics | hadoop cluster


hadoop cluster

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Three Pitfalls for Data Scientists

#artificialintelligence | Dec-2-2020, 08:35:18 GMT

Making mistakes is part of the learning process, and there is probably no way to avoid it. The important thing is to make sure we don't make the same mistake twice. This is not possible if we don't even know we are making a mistake. In what follows, I discuss three common mistakes regarding the use of data science tools and practices. These mistakes make your work inefficient and may cause unnecessary charges.

big data tool, programming language, python, (14 more...)


Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.52)
Information Technology > Data Science > Data Mining (0.37)
Information Technology > Software > Programming Languages (0.32)


Training multiple ML models and running data tasks in parallel via YARN Spark multithreading

#artificialintelligence | Nov-14-2019, 16:23:28 GMT

The objective of this article is to show how a single data scientist can launch dozens or hundreds of data science-related tasks simultaneously (including machine learning model training) without using complex deployment frameworks. In fact, the tasks can be launched from a "data scientist"-friendly interface, namely, a single Python script which can be run from an interactive shell such as Jupyter, Spyder or Cloudera Workbench. The tasks themselves can be parallelised in order to handle large amounts of data, such that we effectively add a second layer of parallelism. "Data science" and "automation" are two words that invariably go hand-in-hand with each other, as one of the key goals of machine learning is to allow machines to perform tasks more quickly, with lower cost, and/or better quality than humans. Naturally, it wouldn't make sense for an organization to spend more on tech staff who are supposed to develop and maintain systems that automate work (data scientists, data engineers, DevOps engineers, software engineers and others) than on the staff who do the work manually.
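The idea of launching many tasks from one controller script can be sketched roughly as follows. This is a minimal illustration using only Python's standard library, not the article's own code: `run_task` and `run_all` are hypothetical names, and the task body is a stand-in that just sums numbers. In a YARN/Spark setting, each task would presumably submit a Spark job instead, and the cluster would supply the second layer of parallelism.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_task(task_name, values):
    """Hypothetical stand-in for one data task (e.g. training one model).

    Assumption: in a Spark setting this function would submit a job to
    the cluster; here it only sums numbers so the sketch runs anywhere.
    """
    return task_name, sum(values)


def run_all(tasks, max_workers=4):
    """Launch every task from a single controller script using threads."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all tasks up front so they can run concurrently.
        futures = [pool.submit(run_task, name, vals) for name, vals in tasks.items()]
        # Collect results in whatever order the tasks finish.
        for future in as_completed(futures):
            name, total = future.result()
            results[name] = total
    return results


if __name__ == "__main__":
    demo_tasks = {"model_a": [1, 2, 3], "model_b": [10, 20], "model_c": []}
    print(run_all(demo_tasks))
```

Threads fit this pattern when each task mostly waits on work done elsewhere (such as a cluster job), so the controller script stays a lightweight coordinator.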

controller task, data scientist, subordinate task, (15 more...)


Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.34)


Twitter removes storage bottlenecks, speeds up Hadoop analytics by 50%

#artificialintelligence | Oct-3-2019, 02:47:48 GMT

Think it's hard keeping up with your Twitter feed? Imagine keeping track of all of Twitter. "Every tweet is comprised of over 100 data points," says Matt Singer, a senior staff hardware engineer responsible for server architecture at Twitter. Data from every retweet, "unfollow", link-click and other actions feeds analytic and deep learning systems serving operational, advertising. How does an organization handle such hyper-scale demands?

hadoop cluster, hard drive, twitter, (13 more...)


Country: North America > United States > Arizona (0.05)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Data Science > Data Mining > Big Data (0.77)


Machine Learning model deployment

#artificialintelligence | Oct-1-2019, 23:35:14 GMT

"Enterprise Machine Learning requires looking at the big picture […] from a data engineering and a data platform perspective," lectured Justin Norman during the talk on the deployment of Machine Learning models at this year's DataWorks Summit in Barcelona. Indeed, an industrial Machine Learning system is part of a vast data infrastructure, which renders an end-to-end ML workflow particularly complex. The challenges linked to the development, deployment, and maintenance of real-world ML systems should not be overlooked as we pursue the finest ML algorithms. Machine Learning is not necessarily meant to replace human decision making; it is mainly about helping humans make complex judgment-based decisions. The talk I attended, Machine Learning Model Deployment: Strategy to Implementation, was given by Cloudera's experts, Justin Norman and Sagar Kewalramani. They gave a presentation on the challenges encountered by an end-to-end ML workflow, focusing on delivering Machine Learning to production.

deployment, ml model, ml workflow, (12 more...)


Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.96)


The New Data Capitalist

#artificialintelligence | Oct-6-2018, 05:22:43 GMT

Data is the real capital that's driving the new digital economy. Just consider any successful social media platform or consumer web service: these companies may be short on facilities and capital equipment, but they're rich in intellectual property, thanks to their ability to slice and dice their proprietary reserves of data capital for competitive advantage. Established companies can see similar opportunities, but only if they learn to unlock the full value of their information reserves. Insights into new business opportunities remain hidden because internal data consumers want new combinations of data and analyses not found in standard reports and dashboards. This is the unseen data that's hiding inside every company. How can enterprises bring data capital out of hiding?

artificial intelligence, cloud computing, data capital, (8 more...)


Industry:

Information Technology > Security & Privacy (0.51)
Law > Intellectual Property & Technology Law (0.36)
Information Technology > Services (0.31)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence (0.77)


Open Sourcing TonY: Native Support of TensorFlow on Hadoop

#artificialintelligence | Sep-16-2018, 18:59:54 GMT

LinkedIn heavily relies on artificial intelligence to deliver content and create economic opportunities for its 575 million members. Following recent rapid advances of deep learning technologies, our AI engineers have started adopting deep neural networks in LinkedIn's relevance-driven products, including feeds and smart-replies. Many of these use cases are built on TensorFlow, a popular deep learning framework written by Google. In the beginning, our internal TensorFlow users ran the framework on small and unmanaged "bare metal" clusters. But we quickly realized the need to connect TensorFlow to the massive compute and storage power of our Hadoop-based big data platform.

artificial intelligence, data mining, machine learning, (14 more...)


Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Water Co. Exploring Use of ML to Detect Quality Issues

#artificialintelligence | Aug-19-2018, 03:02:12 GMT

Everybody expects to have clean drinking water. But as the lead crisis in Michigan has shown, that's not always the case. Now American Water, the largest publicly traded water company in the country, is actively researching the use of machine learning and real-time streaming data technology to detect and identify potentially harmful chemical signatures in its surface drinking water supply. The company is in the early stages of building such a machine learning system. But according to American Water Senior Technologist John Kuchmek, the potential benefits of training machine learning models on real-time water quality data collected by remote sensors are too great to ignore.

artificial intelligence, data mining, machine learning, (17 more...)


Country:

North America > United States > Michigan (0.25)
North America > United States > New Jersey (0.05)
North America > United States > Missouri (0.05)

Industry: Water & Waste Management > Water Management > Water Supplies & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.74)


Best Big Data Hadoop Architect - Hadoop Online Courses | Simpliv

#artificialintelligence | Jul-2-2018, 17:51:15 GMT

Taught by a team which includes 2 Stanford-educated ex-Googlers and 2 ex-Flipkart Lead Analysts. This team has decades of practical experience in working with large-scale data processing jobs. Relational Databases are so stuffy and old! Welcome to HBase – a database solution for a new age. HBase: Do you feel like your relational database is not giving you the flexibility you need anymore?

architect-hadoop online course simpliv, artificial intelligence, data mining, (13 more...)


Genre: Instructional Material > Course Syllabus & Notes (0.35)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.40)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence (1.00)


Options for Deploying Machine Learning Algorithms to AWS

#artificialintelligence | Jun-29-2018, 18:31:54 GMT

AWS is a great place for accessing scalable, cheap resources on which to deploy data models. However, actually using AWS for this purpose can be challenging. If you didn't begin your project on AWS, you have to figure out a way to migrate it there. In addition, you have to determine how to handle the dataset against which you run your algorithm: should you move all of that data into AWS (and deal with the privacy challenges that this raises), just stream the data (which is not cheap), or do something else? In this article, we'll examine different solutions for working with data models on AWS.

activepython ami, artificial intelligence, deploying machine learning algorithm, (10 more...)


Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Strata 2017 Postmortem: More virtual data lake, more operational machine learning | ZDNet

@machinelearnbot | Oct-2-2017, 17:15:22 GMT

There was no shortage of AI on the agenda at Strata. Beyond the headlines, there was growing evidence that the big data community is starting to get serious about real-time processing.

artificial intelligence, data lake, data mining, (16 more...)


Industry: Information Technology (0.31)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence (1.00)
